Chinese Lexical Sememe Prediction Using CilinE Knowledge

Authors

Abstract

Sememes are the smallest semantic units of human languages, the composition of which can represent the meaning of words. Sememes have been successfully applied to many downstream applications in the natural language processing (NLP) field. Annotation of a word's sememes depends on experts, which is both time-consuming and labor-intensive, limiting the large-scale application of sememes. Researchers have proposed some sememe prediction methods to automatically predict sememes for words. However, existing sememe prediction methods focus on information about the word itself, ignoring expert-annotated knowledge bases, which indicate the relations between words and should be of value in sememe prediction. Therefore, we aim at incorporating expert-annotated knowledge bases into the sememe prediction process. To achieve that, we propose a CilinE-guided sememe prediction model, which employs an existing word knowledge base, CilinE, to remodel sememe prediction from a relational perspective. Experiments on HowNet, a widely used Chinese sememe knowledge base, have shown that CilinE has an obvious positive effect on sememe prediction. Furthermore, our proposed method can be integrated into existing methods and significantly improves prediction performance. We will release the data and code to the public.

Related resources

Lexical Sememe Prediction via Word Embeddings and Matrix Factorization

Sememes are defined as the minimum semantic units of human languages. People have manually annotated lexical sememes for words and form linguistic knowledge bases. However, manual construction is time-consuming and labor-intensive, with significant annotation inconsistency and noise. In this paper, we for the first time explore to automatically predict lexical sememes based on semantic meanings...

A Large-scale Lexical Semantic Knowledge-base of Chinese

The Semantic Knowledge-base of Contemporary Chinese (SKCC) is a large scale Chinese semantic resource developed by the Institute of Computational Linguistics of Peking University. It provides a large amount of semantic information such as semantic hierarchy and collocation features for 66,539 Chinese words and their English counterparts. Its POS and semantic classification represent the latest ...

Using Lexical Knowledge in Text Classification

This paper describes several experiments in text classification using WordNet, a rich source of lexical background knowledge available in the public domain. WordNet is used to map the original words from a text into sets based on synonym and hypernym relationships. This information is used to compute a change of representation from bag of words to hypernym density. Six binary classification tas...

Using Default Logic for Lexical Knowledge

Lexical knowledge is knowledge about the morphology, grammar, and semantics of words. This knowledge is increasingly important in language engineering and, more generally, in information retrieval, information filtering, intelligent agents, and knowledge management. Here we present a framework based on default logic, called Lexica, for capturing lexical knowledge. We show how we can use contextual informa...

Assessing Chinese Readability using Term Frequency and Lexical Chain

This paper investigates the appropriateness of using lexical cohesion analysis to assess Chinese readability. In addition to term frequency features, we derive features from the result of lexical chaining to capture the lexical cohesive information, where E-HowNet lexical database is used to compute semantic similarity between nouns with high word frequency. Classification models for assessing ...

Journal

Journal title: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

Year: 2023

ISSN: 1745-1337, 0916-8508

DOI: https://doi.org/10.1587/transfun.2022eap1074